WASP: Scalable Bayes via barycenters of subset posteriors

نویسندگان

  • Sanvesh Srivastava
  • Volkan Cevher
  • Quoc Tran-Dinh
  • David B. Dunson
چکیده

The promise of Bayesian methods for big data sets has not fully been realized due to the lack of scalable computational algorithms. For massive data, it is necessary to store and process subsets on different machines in a distributed manner. We propose a simple, general, and highly efficient approach, which first runs a posterior sampling algorithm in parallel on different machines for subsets of a large data set. To combine these subset posteriors, we calculate the Wasserstein barycenter via a highly efficient linear program. The resulting estimate for the Wasserstein posterior (WASP) has an atomic form, facilitating straightforward estimation of posterior summaries of functionals of interest. The WASP approach allows posterior sampling algorithms for smaller data sets to be trivially scaled to huge data. We provide theoretical justification in terms of posterior consistency and algorithm efficiency. Examples are provided in complex settings including Gaussian process regression and nonparametric Bayes mixture models.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

THE EMPIRICAL BAYES METHOD OF ANALYSIS OF A SERIES OF EXPERIMENTS

The classical method of analysis of a series of experiments is somewhat involved in being conditional on various, occasionally unrealistic, assumptions such as homogeneity of variances of experimental error, lack of interactions of treatments and places,etc. In this work, we adopt a Bayesian view to account for such heterogeneities. Our appoach is illustrated by a real series of experiment...

متن کامل

Empirical Bayes interval estimates that are conditionally equal to unadjusted confidence intervals or to default prior credibility intervals.

Problems involving thousands of null hypotheses have been addressed by estimating the local false discovery rate (LFDR). A previous LFDR approach to reporting point and interval estimates of an effect-size parameter uses an estimate of the prior distribution of the parameter conditional on the alternative hypothesis. That estimated prior is often unreliable, and yet strongly influences the post...

متن کامل

Group invariant inferred distributions via noncommutative probability

Abstract: One may consider three types of statistical inference: Bayesian, frequentist, and group invariance-based. The focus here is on the last method. We consider the Poisson and binomial distributions in detail to illustrate a group invariance method for constructing inferred distributions on parameter spaces given observed results. These inferred distributions are obtained without using Ba...

متن کامل

PAC-Bayes Risk Bounds for Stochastic Averages and Majority Votes of Sample-Compressed Classifiers

We propose a PAC-Bayes theorem for the sample-compression setting where each classifier is described by a compression subset of the training data and a message string of additional information. This setting, which is the appropriate one to describe many learning algorithms, strictly generalizes the usual data-independent setting where classifiers are represented only by data-independent message...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015